Small Domain Randomization: Same Privacy, More Utility
نویسندگان
چکیده
Random perturbation is a promising technique for privacy preserving data mining. It retains an original sensitive value with a certain probability and replaces it with a random value from the domain with the remaining probability. If the replacing value is chosen from a large domain, the retention probability must be small to protect privacy. For this reason, previous randomizationbased approaches have poor utility. In this paper, we propose an alternative way to randomize sensitive values, called small domain randomization. First, we partition the given table into sub-tables that have smaller domains of sensitive values. Then, we randomize the sensitive values within each sub-table independently. Since each sub-table has a smaller domain, a larger retention probability is permitted. We propose this approach as an alternative to classical partition-based approaches to privacy preserving data publishing. There are two key issues: ensure the published sub-tables do not disclose more private information than what is permitted on the original table, and partition the table so that utility is maximized. We present an effective solution.
منابع مشابه
Preserving Categorical Data Analysis
Ling Guo. Randomization Based Privacy Preserving Categorical Data Analysis. Under the direction of Dr. Xintao Wu The success of data mining relies on the availability of high quality data. To ensure quality data mining, effective information sharing between organizations becomes a vital requirement in today’s society. Since data mining often involves sensitive information of individuals, the pu...
متن کاملLimiting Attribute Disclosure in Randomization Based Microdata Release
Privacy preserving microdata publication has received wide attention. In this paper, we investigate the randomization approach and focus on attribute disclosure under linking attacks. We give efficient solutions to determine optimal distortion parameters, such that we can maximize utility preservation while still satisfying privacy requirements. We compare our randomization approach with l-dive...
متن کاملDifferentially Private Local Electricity Markets
Privacy-preserving electricity markets have a key role in steering customers towards participation in local electricity markets by guarantying to protect their sensitive information. Moreover, these markets make it possible to statically release and share the market outputs for social good. This paper aims to design a market for local energy communities by implementing Differential Privacy (DP)...
متن کاملDifferential Privacy: On the Trade-Off between Utility and Information Leakage
Differential privacy is a notion of privacy that has become very popular in the database community. Roughly, the idea is that a randomized query mechanism provides sufficient privacy protection if the ratio between the probabilities that two adjacent datasets give the same answer is bound by eǫ. In the field of information flow there is a similar concern for controlling information leakage, i.e...
متن کاملAlgorithm-irrelevant Privacy Protection Method Based on Randomization
Privacy preserving classification mining is one of the fast-growing subareas of data mining. The algorithm-related methods of privacy-preserving are designed for particular classification algorithm and couldn’t be used in other classification algorithms. To solve this problem, it proposes a new algorithm-irrelevant privacy protection method based on randomization. This method generates and open...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- PVLDB
دوره 3 شماره
صفحات -
تاریخ انتشار 2010